Results

Species Richness

Richness Summary

Give a basic summary of species richness stats. For example, the regions with the lowest and highest long-term averages in species richness are:

kable(comm_master[,mean(reg_rich), by='reg'][reg%in%c("gmex","shelf")])
reg V1
gmex 142.28522
shelf 46.29991

Can also measure long-term variability of the different regions. Here is the standard deviation for each region:

kable(comm_master[,stats::sd(reg_rich),by='reg'])
reg V1
ai 2.444883
ebs 5.566457
gmex 4.266274
goa 8.506057
neus 8.237090
newf 4.365340
sa 4.035492
shelf 3.246860
wctri 7.153178

And here is the cross-region average of those long-term standard deviations:

kable(comm_master[,stats::sd(reg_rich),by='reg'][,mean(V1)])
5.313515

Figure 1. MSOM richness time series

richness_ts()
**Figure 1.** Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.

Figure 1. Time series of MSOM estimates of region richness. Each point is the posterior mean of regional richness in a year. Lines indicate long-term trends from fitted values of linear regression models predicting richness from time.

Figure S1. MSOM - naive scatter

naive_msom_scatter()
**Figure S1.** MSOM richness vs naive richness

Figure S1. MSOM richness vs naive richness

MSOM richness and Naive richness are pretty similar. MSOM richness is probably more accurate, or is at least more conservative b/c it has fewer significant trends. But their similarity should help justify using observed presences/ absences in other analyses.

Manuscript paragraph:
MSOM estimates of richness were greater than Naive estimates, but the two methods produced similar temporal dynamics in richness (Figure S1). Henceforth, we report MSOM richness estimates. The greatest long-term average richness was the Gulf of Mexico (142.3), lowest in the Scotian Shelf (46.3). The inter-region average of long-term standard deviations of richness was 5.3; the Aleutian Islands showed the lowest variability (sd = 2.4), and the Gulf of Alaska was the most variable (sd = 8.5).


Richness Trend

Examine trends for naive richness:

load("../../pkgBuild/results/rich_naive_trend_kendall.RData")
rich_naive_trend_kendall[reg!="wcann",BH:=p.adjust(taup, method='BH')]
rich_naive_trend_kendall <- rich_naive_trend_kendall[reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=taup)]

Table S1. Naive tau

kable(rich_naive_trend_kendall)
reg estimate BH p.value
ai -0.0606061 0.8370115 0.8370115
ebs 0.5010753 0.0001810 0.0000805
gmex 0.8529413 0.0000097 0.0000021
goa 0.2051282 0.4051368 0.3601216
neus 0.3368132 0.0103196 0.0080264
newf 0.7865834 0.0001112 0.0000371
sa -0.5685352 0.0002831 0.0001573
shelf 0.7048780 0.0000000 0.0000000
wctri 0.7640932 0.0045626 0.0030417

In the Naive estimates, 7 regions had significant \(\tau_b\).
Examine trends for MSOM richness:

load("../../pkgBuild/results/rich_trend_kendall.RData")
rich_trend_kendall[reg!="wcann",BH:=p.adjust(pvalue, method="BH")]
rich_trend_kendall <- rich_trend_kendall[reg!="wcann",list(reg=reg, estimate=tau, BH=BH, p.value=pvalue)]

Table 2. MSOM tau

kable(rich_trend_kendall)
reg estimate BH p.value
ebs 0.4206052 0.0028542 0.0009514
ai 0.2317340 0.3652207 0.3246406
goa 0.2419794 0.3649750 0.2838694
wctri 0.6094305 0.0417890 0.0185729
gmex 0.0976038 0.5996890 0.5996890
sa -0.2179532 0.2121777 0.1414518
neus 0.1912424 0.2121777 0.1325807
shelf 0.4524160 0.0004334 0.0000482
newf 0.7267104 0.0006228 0.0001384

In the MSOM estimates, 4 regions had significant \(\tau_{b}\).

Overall, ~half the regions show positive trends. No regions show significant negative trends (although SEUS is close, depending on the analysis). It’s a bit surprising how much removing the autocorrelation seemed to impact some of the trends (AI in particular, I think).

Manuscript paragraph:
Estimated slopes for long-term trends (Kendall’s τb) in richness were positive for most regions. For Naïve estimates, all the seven positive τ were also significantly different from 0, whereas the two negative τb, Aleutian Islands and Southeast US, were not (Table S2). Long-term trends in MSOM estimates of richness had the same sign as Naïve trends, except for Aleutian Islands, but now the only significant τb were the following four regions: Eastern Bering Sea (τb = 0.41), West Coast US (τb = 0.61), Scotian Shelf (τb = 0.45), and Newfoundland (τb = 0.72) (Table 1). Richness trends were not significant in the Gulf of Mexico, Gulf of Alaska, and Northeast US, Aleutian Islands, Southeast US.


Richness and Geographic Range

Looking for spatial footprint that will predict richness. Here I characterize the geographic distribution of a species in two ways: its density and its size. This is tricky for richness because richness is a community level trait, but I’ve characterized these two aspects of geographic distribution at the species level. In fact, these “species level” attributes are also potentially dyanmic – a species need not have fixed range density or size. So there are two levels of averaging – for each species take its long-term average value, then for each community take the average across species.

First I’ll show a figure relating range size and range density. Then I’ll present the figure of how size and density can predict richness. Finally, I’ll explore models more formally expressing the relationship between richness and size/ density.

Figure S2

rangeSizeDens()
**Figure S2**. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species' long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.

Figure S2. Range density versus range size. In panel A each point is a species-region-year combination. In panel B, each point is a region-year. Range size is the proportion of sites occupied, range density the tows in occupied sites. The community metrics in B is calculated by take each species’ long-term average from A, then taking the average across all species present in the community in a given year. Fitted lines in A are from a loess fit.

These plots show that there is a strong relationship between range size and range density. Interestingly, in Figure S2B, the cross-region relationship is negative (if each color had 1 pt), whereas the within-region relationship is positive.

The positive relationship between size and density is not surprising. My interpretation of density is ~population size. Often population size is correlated with range size. I think this is a standard result, but I need to double-check.

Figure 2

rich_geoRange()
**Figure 2.** Species richness vs A) geographic range size, and B) geographic range density. Both metrics are based on each species' long-term average of a statistic; range size is the proportion of sites occupied, range density is the proportion of tows in occupied sites. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.

Figure 2. Species richness vs A) geographic range size, and B) geographic range density. Both metrics are based on each species’ long-term average of a statistic; range size is the proportion of sites occupied, range density is the proportion of tows in occupied sites. Solid lines are linear regressions with MSOM richness as the response and the horizontal axis and an intercept as the predictors.

Both range size and range density are pretty good predictors of species richness. I think I had originally missed the range size relationship b/c I hadn’t done the same aggregating procedure. The interpretation I have is that richness is highest when you have a bunch of rare species.

Richness and Range Density

The goal here is to see if species richness is predicted by the typical range density of community’s constituent species. First I’ll run different types of models just to explore whether this is true, in general (across regions). Then I’ll drill in to each region individually to answer the same question.

# This is a function that'll help summarize model fit and coefficients and parameter significance:  
mod_smry <- function(m, pred_name=c("density","size")){
    pred_name <- match.arg(pred_name)
    sc <- sem.coefs(m)
    mod_call <- switch(class(m), lmerMod=m@call, lm=m$call)
    mod_call <- as.character(mod_call)[2]
    out <- cbind(
        mod_call = mod_call,
        sc[sc[,"predictor"]==pred_name,],
        sem.model.fits(m)
    )
    out[,c("Class","N","mod_call","predictor","estimate","std.error","p.value","Marginal","Conditional")]
}

# Make a data set that is useful for these regressions (short variable names, etc):  
range_reg <- comm_master[,list(reg, year, rich=reg_rich, density=propTow_occ_avg, size=propStrata_avg_ltAvg)]

# Fit different models to the whole data set
rich_dens_mods <- list()
rich_dens_mods[[1]] <- lm(rich ~ 1 + density, data=range_reg)
rich_dens_mods[[2]] <- lm(rich ~ 1 + density*reg, data=range_reg)
rich_dens_mods[[3]] <- lmer(rich ~ 1 + density + (1|reg), data=range_reg)
rich_dens_mods[[4]] <- lmer(rich ~ 1 + density + (1+density|reg), data=range_reg)
rich_dens_smry <- rbindlist(lapply(rich_dens_mods, mod_smry, pred_name="density"))

# Fit same model to each region separately 
rich_dens_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    rich_dens_reg_mods[[r]] <- lm(rich ~ 1 + density, data=range_reg[reg==ur[r]])
}
rich_dens_reg_smry <- data.table(reg=ur, rbindlist(lapply(rich_dens_reg_mods, mod_smry, pred_name="density")))

# unlist(lapply(rich_dens_mods, stargazer, type='html'))
stargazer(rich_dens_mods[[1]], rich_dens_mods[[2]], rich_dens_mods[[3]], rich_dens_mods[[4]], type='html')
Dependent variable:
rich
OLS linear
mixed-effects
(1) (2) (3) (4)
density -94.071*** -341.590** -525.158*** -664.300***
(17.038) (140.711) (36.360) (96.041)
regebs 753.130***
(132.497)
reggmex 284.278**
(116.047)
reggoa 192.097**
(92.351)
regneus 216.220**
(88.588)
regnewf 365.078**
(150.048)
regsa 31.548
(100.847)
regshelf -30.070
(86.972)
regwctri 229.957**
(105.721)
density:regebs -802.601***
(193.269)
density:reggmex -728.052***
(264.670)
density:reggoa -535.503***
(177.692)
density:regneus -352.184**
(156.568)
density:regnewf -496.211**
(237.666)
density:regsa -21.417
(184.284)
density:regshelf 48.959
(149.262)
density:regwctri -444.658**
(193.577)
Constant 143.267*** 251.093*** 375.585*** 455.269***
(9.843) (81.730) (28.308) (73.939)
Observations 197 197 197 197
R2 0.135 0.990
Adjusted R2 0.131 0.990
Log Likelihood -571.773 -554.875
Akaike Inf. Crit. 1,151.546 1,121.749
Bayesian Inf. Crit. 1,164.679 1,141.448
Residual Std. Error 29.771 (df = 195) 3.259 (df = 179)
F Statistic 30.485*** (df = 1; 195) 1,096.211*** (df = 17; 179)
Note: p<0.1; p<0.05; p<0.01
kable(rich_dens_smry) # different kinds of models
Class N mod_call predictor estimate std.error p.value Marginal Conditional
lm 197 rich ~ 1 + density density -94.0708 17.03765 0.0000 0.1351986 NA
lm 197 rich ~ 1 + density * reg density -341.5902 140.71094 0.0162 0.9904861 NA
lmerMod 197 rich ~ 1 + density + (1 | reg) density -525.1580 36.35966 0.0000 0.5358370 0.9981927
lmerMod 197 rich ~ 1 + density + (1 + density | reg) density -664.3002 96.04126 0.0000 0.3493638 0.9994551
kable(rich_dens_reg_smry) # same model applied to each reg sep
reg Class N mod_call predictor estimate std.error p.value Marginal Conditional
ai lm 12 rich ~ 1 + density density -341.5902 24.24775 0.0000 0.9520286 NA
ebs lm 31 rich ~ 1 + density density -1144.1910 88.47021 0.0000 0.8522400 NA
gmex lm 17 rich ~ 1 + density density -1069.6426 124.78482 0.0000 0.8304652 NA
goa lm 13 rich ~ 1 + density density -877.0932 132.51898 0.0000 0.7992927 NA
neus lm 32 rich ~ 1 + density density -693.7744 122.76529 0.0000 0.5156318 NA
newf lm 16 rich ~ 1 + density density -837.8009 142.75834 0.0000 0.7109900 NA
sa lm 25 rich ~ 1 + density density -363.0073 130.10022 0.0104 0.2528899 NA
shelf lm 41 rich ~ 1 + density density -292.6311 18.11578 0.0000 0.8699704 NA
wctri lm 10 rich ~ 1 + density density -786.2482 136.01170 0.0004 0.8068424 NA
kable(as.data.frame(as.list( # avg of above
    sapply(rich_dens_reg_smry, function(x)suppressWarnings(mean(x)))
))) 
reg Class N mod_call predictor estimate std.error p.value Marginal Conditional
NA NA 21.88889 NA NA -711.7754 102.197 0.0012 0.7322612 NA

All models are pretty good predictors. Well, the most basic model kinda sucks I guess. It needs to account for some of the between-region variation.

Richness and Range Size

Same as above, but now let’s look at range size as a predictor of species richness.

“Is range size a good predictor of species richness?”

rich_size_mods <- list()
rich_size_mods[[1]] <- lm(rich ~ 1 + size, data=range_reg)
rich_size_mods[[2]] <- lm(rich ~ 1 + size*reg, data=range_reg)
rich_size_mods[[3]] <- lmer(rich ~ 1 + size + (1|reg), data=range_reg)
rich_size_mods[[4]] <- lmer(rich ~ 1 + size + (1+size|reg), data=range_reg)
rich_size_smry <- rbindlist(lapply(rich_size_mods, mod_smry, pred_name="size"))

rich_size_reg_mods <- list()
ur <- range_reg[,unique(reg)]
for(r in 1:length(ur)){
    rich_size_reg_mods[[r]] <- lm(rich ~ 1 + size, data=range_reg[reg==ur[r]])
}
rich_size_reg_smry <- data.table(reg=ur, rbindlist(lapply(rich_size_reg_mods, mod_smry, pred_name="size")))

# unlist(lapply(rich_size_mods, stargazer, type='html'))
stargazer(rich_size_mods[[1]], rich_size_mods[[2]], rich_size_mods[[3]], rich_size_mods[[4]], type='html')
Dependent variable:
rich
OLS linear
mixed-effects
(1) (2) (3) (4)
size -41.744 -298.714*** -320.492*** -446.741***
(27.019) (108.160) (20.673) (71.160)
regebs 89.396**
(38.336)
reggmex 166.444***
(46.423)
reggoa 73.846*
(38.392)
regneus 111.614***
(37.349)
regnewf 30.344
(40.702)
regsa 58.769
(42.746)
regshelf -61.244*
(36.886)
regwctri 61.122
(39.410)
size:regebs -297.732**
(119.580)
size:reggmex -197.781
(134.793)
size:reggoa -185.302
(116.924)
size:regneus -543.256***
(118.393)
size:regnewf -256.003*
(137.719)
size:regsa 49.873
(119.451)
size:regshelf 144.584
(109.847)
size:regwctri -85.911
(116.901)
Constant 102.122*** 153.321*** 185.206*** 211.391***
(8.045) (36.441) (15.181) (21.935)
Observations 197 197 197 197
R2 0.012 0.995
Adjusted R2 0.007 0.994
Log Likelihood -562.694 -496.586
Akaike Inf. Crit. 1,133.389 1,005.172
Bayesian Inf. Crit. 1,146.522 1,024.871
Residual Std. Error 31.819 (df = 195) 2.421 (df = 179)
F Statistic 2.387 (df = 1; 195) 1,994.843*** (df = 17; 179)
Note: p<0.1; p<0.05; p<0.01
kable(rich_size_smry)
Class N mod_call predictor estimate std.error p.value Marginal Conditional
lm 197 rich ~ 1 + size size -41.74371 27.01868 0.1240 0.0120930 NA
lm 197 rich ~ 1 + size * reg size -298.71408 108.16038 0.0063 0.9947494 NA
lmerMod 197 rich ~ 1 + size + (1 | reg) size -320.49227 20.67327 0.0000 0.2932665 0.9945506
lmerMod 197 rich ~ 1 + size + (1 + size | reg) size -446.74051 71.15995 0.0000 0.3319367 0.9986230
kable(rich_size_reg_smry)
reg Class N mod_call predictor estimate std.error p.value Marginal Conditional
ai lm 12 rich ~ 1 + size size -298.7141 64.79745 0.0010 0.6800184 NA
ebs lm 31 rich ~ 1 + size size -596.4463 44.20232 0.0000 0.8626087 NA
gmex lm 17 rich ~ 1 + size size -496.4955 70.67676 0.0000 0.7668957 NA
goa lm 13 rich ~ 1 + size size -484.0165 72.54219 0.0000 0.8018673 NA
neus lm 32 rich ~ 1 + size size -841.9705 64.00081 0.0000 0.8522680 NA
newf lm 16 rich ~ 1 + size size -554.7167 57.74506 0.0000 0.8682739 NA
sa lm 25 rich ~ 1 + size size -248.8411 68.97584 0.0015 0.3613805 NA
shelf lm 41 rich ~ 1 + size size -154.1298 8.31788 0.0000 0.8980015 NA
wctri lm 10 rich ~ 1 + size size -384.6251 28.71723 0.0000 0.9573075 NA
kable(as.data.frame(as.list(sapply(rich_size_reg_smry, function(x)suppressWarnings(mean(x))))))
reg Class N mod_call predictor estimate std.error p.value Marginal Conditional
NA NA 21.88889 NA NA -451.1062 53.33061 0.0002778 0.7831802 NA

Predicting Richness: Range Size or Density?

As far as picking one or the other, it doesn’t end up mattering much. Range size is a lot better than density in NEUS, and density outperforms size in AI. Otherwise, size as a slight edge over density on average, although both predictors are significant in all regions.

rich_ds_smry <- rbind(rich_dens_reg_smry, rich_size_reg_smry)
setkey(rich_ds_smry, reg, Class, predictor)
kable(rich_ds_smry)
reg Class N mod_call predictor estimate std.error p.value Marginal Conditional
ai lm 12 rich ~ 1 + density density -341.5902 24.24775 0.0000 0.9520286 NA
ai lm 12 rich ~ 1 + size size -298.7141 64.79745 0.0010 0.6800184 NA
ebs lm 31 rich ~ 1 + density density -1144.1910 88.47021 0.0000 0.8522400 NA
ebs lm 31 rich ~ 1 + size size -596.4463 44.20232 0.0000 0.8626087 NA
gmex lm 17 rich ~ 1 + density density -1069.6426 124.78482 0.0000 0.8304652 NA
gmex lm 17 rich ~ 1 + size size -496.4955 70.67676 0.0000 0.7668957 NA
goa lm 13 rich ~ 1 + density density -877.0932 132.51898 0.0000 0.7992927 NA
goa lm 13 rich ~ 1 + size size -484.0165 72.54219 0.0000 0.8018673 NA
neus lm 32 rich ~ 1 + density density -693.7744 122.76529 0.0000 0.5156318 NA
neus lm 32 rich ~ 1 + size size -841.9705 64.00081 0.0000 0.8522680 NA
newf lm 16 rich ~ 1 + density density -837.8009 142.75834 0.0000 0.7109900 NA
newf lm 16 rich ~ 1 + size size -554.7167 57.74506 0.0000 0.8682739 NA
sa lm 25 rich ~ 1 + density density -363.0073 130.10022 0.0104 0.2528899 NA
sa lm 25 rich ~ 1 + size size -248.8411 68.97584 0.0015 0.3613805 NA
shelf lm 41 rich ~ 1 + density density -292.6311 18.11578 0.0000 0.8699704 NA
shelf lm 41 rich ~ 1 + size size -154.1298 8.31788 0.0000 0.8980015 NA
wctri lm 10 rich ~ 1 + density density -786.2482 136.01170 0.0004 0.8068424 NA
wctri lm 10 rich ~ 1 + size size -384.6251 28.71723 0.0000 0.9573075 NA
fight <- rich_ds_smry[,list(marginal_density_minus_size=.SD[predictor=="density", Marginal] - .SD[predictor=="size", Marginal]),by='reg']
kable(data.table(fight, mu=fight[,mean(marginal_density_minus_size)]))
reg marginal_density_minus_size mu
ai 0.2720102 -0.0509189
ebs -0.0103687 -0.0509189
gmex 0.0635695 -0.0509189
goa -0.0025746 -0.0509189
neus -0.3366362 -0.0509189
newf -0.1572839 -0.0509189
sa -0.1084906 -0.0509189
shelf -0.0280311 -0.0509189
wctri -0.1504652 -0.0509189

Conclusion for Richness and Geographic Range

The two metrics of geographic range are well correlated. Furthermore, richness can be predicted pretty well using regressions with either as a predictor. There are large differences among regions, though. This is probably because richness is not readily comparable among most regions. Regions vary mostly in their intercept values, and they have fairly similar slopes (though they are not identical, and model fits improve when allowing slopes to vary among regions; it’s just that the improvement is small compared to allowing intercepts to vary among regions).

The interpretation of the result that geographic distribution predicts species richness is likely associated with species rarity. When the average range density or range size of a community is low, it means it has a lot of species that are rare (at either spatial scale). It’s these rare species that come and go, and form the dynamics of richness that we observe. When that dynamical value is high, it implies that an above-average number of the dynamic species are present. Because those species are transient (dynamic), they are also probably rare.


Colonization and Extinction

Colonization and Extinction Summary

Figure S3. Barplot of categories

categ_barplot()
# And here's the table behind the barplot:  
kable(t(
    spp_master[!duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
)[c(4,1,2,3),])
ai ebs gmex goa neus newf sa shelf wctri
neither 39 64 105 63 56 49 69 27 64
both 6 44 31 17 83 12 35 20 15
colonizer 9 2 8 18 1 10 0 1 12
leaver 1 0 0 0 1 1 0 0 1
# Or, to see the most common categories overall:  
kable(data.frame(as.list(
    rowSums(spp_master[!duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)][,c(4,1,2,3)])
)))
ai ebs gmex goa neus newf sa shelf wctri
55 110 144 98 141 72 104 48 92
# Does that pattern change if we just look at regions with positive slopes? 
pos_reg <- c("ebs","newf","shelf","wctri")
kable(data.frame(as.list(
    rowSums(t(
        spp_master[reg%in%pos_reg & !duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
    )[c(4,1,2,3),])
)))
neither both colonizer leaver
204 91 25 2
kable(data.frame(as.list(
    rowSums(t(
        spp_master[!reg%in%pos_reg & !duplicated(paste(reg,ce_categ,spp)), table(reg, ce_categ)]
    )[c(4,1,2,3),])
)))
neither both colonizer leaver
332 172 36 2
**Figure S3.** Number of species beloning to the categories of both, neither, colonizer, leaver in each region

Figure S3. Number of species beloning to the categories of both, neither, colonizer, leaver in each region

It’s the same pattern, whichever way you split it. However, AI is the only region that had more colonizers than both species. An interesting way to think about some of this is that the average sd in richness was 5.3135146, so when the number of colonizer or leaver species exceed’s that region’s sd, the impact of those categories, which I consider to be dubious, might start being relevant (though it’s not necessarily problematic, nor is this even close to an actual test for the significance of those categories to the trend). EBS and Shelf had significant positive trends in richness and very low numbers in the colonizer category. WCTRI and NEWF had similar numbers in the both and colonizer category.

Figure S4. Time series of colonizations and extinctions

col_ext_ts()
**Figure S4.** Number of colonizations (blue) and extinctions (red) over time in each region.

Figure S4. Number of colonizations (blue) and extinctions (red) over time in each region.

In most regions the differences in colonization and extinction numbers are similar over time. The most obvious exceptions are for the 3 regions that showed large initial spikes in richness; the GOA, GMEX, and AI regions initially have much larger numbers of colonizers than leavers, but this number shrinks rapidly until the two rates are ~equal.

For the regions with significant positive slopes, there is no visually obvious increase in colonizations relative to extinctions over time. Because the colonization and extinction numbers tend to track each other over the long-term, it it would be difficult to attribute the long-term changes in richness to a change in just colonization or extinction rates.

Manuscript paragraph:
A time series of richness can be decomposed into the colonizations and extinctions of individual species over time. We categorized species according to the following colonization extinction patterns: present in all years = neither (536 species), colonized and went extinct = both (263 species), initially absent but present every year after its colonization = colonizer (61 species), initially present but absent every year after its extinction = leaver (4 species). Most regions had the same overall ranking (neither > both > colonizer > leaver), except in the Northeast US where both was the most common and neither second, and in the Aleutian Islands where colonizer was the second most common and both third (Figure S4). In general, changes in richness were not due to permanent departures or introductions of species to the region. Furthermore, colonization and extinction rates did not become more dissimilar over time for any region (Figure S5). Colonizations were initially greater than extinctions in Aleutian Islands, Gulf of Alaska, and Gulf of Mexico, but this difference disappeared in the latter portion of these time series, as evidenced by these regions’ initially rapid increase in richness that later plateaued. The other regions did not show strong trends in the difference between colonizations and extinctions over time, making it difficult to attribute the long-term trends in richness to changes in just colonization or just extinction rates.

Total Colonizations + Extinctions and Geographic Range

The result that richness is predicted by geographic range implied an underlying association between range, colonization/ extinction, and richness itself. This paper begins by explaining richness with range. It will end by explaining how range changes near a colonization/ extinction event. Here, between the two, I’ll show how the number of colonizations is related to range.
####Figure S5

ceEventRange()
**Figure S5.** Number of colonizations and extinctions as a function of range size and range density.

Figure S5. Number of colonizations and extinctions as a function of range size and range density.

Yup, this is definitely a thing. Long-term average range size and range density predict how many colonizations and extinctions a species is likely to have. This will lead nicely into examing how range changes prior to an extinction or after a colonization.


Spatial Clustering of Colonization and Extinction

Figure 3. Colonization map

ceRate_map(ce="colonization")
**Figure 2.** Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure 2. Maps of long-term averages of colonizations per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of colonization rate were smoothed using a Gaussian kernel smoother. The smoothed colonization rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure S7. Extinction map

ceRate_map(ce="extinction")
**Figure S7.** Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Figure S7. Maps of long-term averages of extinctions per site per decade for each region: A) E. Bering Sea, B) Gulf of Alaska, C) Aleutian Islands, D) Scotian Shelf, E) West Coast US, F) Newfoundland, G) Gulf of Mexico, H) Northeast US, I) Southeast US. Values of extinction rate were smoothed using a Gaussian kernel smoother. The smoothed extinction rate is indicated by the color bars in each panel; colors are scaled independently for each region.

Hotspots can be seen in most regions. Newfoundland also has high values around its edge (as opposed to interior), it seems. NEUS and Gmex show very strong hotspots, and other locations tend to be much much lower. Other regions show more of a continuum.

rel_col_ext_rate <- mapDat[,j={
    map_smooth_col <- spatstat::Smooth(spatstat::ppp(x=.SD[,lon], y=.SD[,lat], marks=.SD[,n_spp_col_weighted], window=mapOwin[[reg]]), hmax=1)
    mark_range_col <- range(map_smooth_col, na.rm=TRUE)*10
    
    map_smooth_ext <- spatstat::Smooth(spatstat::ppp(x=.SD[,lon], y=.SD[,lat], marks=.SD[,n_spp_ext_weighted], window=mapOwin[[reg]]), hmax=1)
    mark_range_ext <- range(map_smooth_ext, na.rm=TRUE)*10
    ol <- list(
        minval_col=mark_range_col[1], maxval_col=mark_range_col[2], max_o_min_col=do.call("/",as.list(rev(mark_range_col))),
        minval_ext=mark_range_ext[1], maxval_ext=mark_range_ext[2], max_o_min_ext=do.call("/",as.list(rev(mark_range_ext)))
    )
    lapply(ol, function(x)if(is.numeric(x)){signif(x,3)}else{x})
},by=c("reg")]
kable(rel_col_ext_rate, caption="Table. The colonization and extinction intensity range and max/min ratio")
Table. The colonization and extinction intensity range and max/min ratio
reg minval_col maxval_col max_o_min_col minval_ext maxval_ext max_o_min_ext
ebs 1.16e-01 0.554 4.79 1.20e-01 0.546 4.56e+00
ai 5.66e-02 0.483 8.53 0.00e+00 0.174 1.37e+09
goa 1.17e-03 0.764 652.00 1.75e-02 0.567 3.24e+01
wctri 2.32e-02 1.160 50.10 9.21e-03 0.951 1.03e+02
gmex 5.08e-01 3.180 6.25 7.27e-02 3.280 4.51e+01
sa 1.08e+00 2.180 2.03 7.58e-01 3.690 4.87e+00
neus 1.06e-01 22.500 213.00 8.73e-02 21.100 2.42e+02
shelf 4.81e-05 1.220 25400.00 5.74e-05 1.050 1.82e+04
newf 3.55e-02 0.582 16.40 1.07e-02 0.612 5.71e+01
kable(rel_col_ext_rate[,lapply(.SD, median)], caption="Table. Median of above table")
Table. Median of above table
reg minval_col maxval_col max_o_min_col minval_ext maxval_ext max_o_min_ext
neus 0.0566 1.16 16.4 0.0175 0.951 57.1

Figure S8. Colonization neighborhood

nb_moranI(ce="colonization")
**Figure S8.** Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S8. Connectivity and local spatial autocorrelation of colonization events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S9. Extinction neighborhood

nb_moranI(ce="extinction")
**Figure S9.** Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.

Figure S9. Connectivity and local spatial autocorrelation of extinction events in each region. Each site is represented by a point. Points connected by a line are neighbors. For each region, neighbors were determined by first calculating the minimum distance required to allow each site to have at least 1 neighbor. Neighbors of a focal point were then defined as the points within this minimum distance from the focal point. Local spatial autocorrelation is local Moran’s I, significant LMI is indicated by a solid point, the color of which indicates the value of the LMI statistic. The outline is the region boundary used for smoothing in Figure 3 (main text), but does not affect calculations of LMI.


Geographic Range Near Colonization and Extinction

Figure 4

rangeSize_absenceTime()
Figure 4. Geographic range size (A,B) and geographic range density (C,D) vs years until extinction (A,C) and years after colonization (B,D). For each unique value on the horizontal axis, the cross-species average for the range metric is displayed, and a linear model fit through this average. Statistics in main text do not use this aggregation. Extinction events are identified as occuring the year before the species is absent (?right?), colonization the first year it is present after an absence.

Figure 4. Geographic range size (A,B) and geographic range density (C,D) vs years until extinction (A,C) and years after colonization (B,D). For each unique value on the horizontal axis, the cross-species average for the range metric is displayed, and a linear model fit through this average. Statistics in main text do not use this aggregation. Extinction events are identified as occuring the year before the species is absent (?right?), colonization the first year it is present after an absence.

Range size declines near an absence much more consistently than does range density; both are (relatively) low just before extinction and just after colonization. However, range density has much more variable intercepts among regions, whereas range size does not.

This makes sense, at least somewhat, because colonization and extinction events are defined at the site level; though the outcome isn’t necessitated by this formulation, because size could drop suddenly. In fact, when a species is absent, both its range size and its range density must be 0 (though, range density is technically calculated for only those sites that are occupied, so I supposed it’s technically undefined according to the equations I’m using).

I think the regressions for range size should omit an intercept, while the regressions for range density should have it. This might be hard to justify fully a priori (though see my thinking in previous paragraph), so I’ll probably just do a model selection and maybe discuess the difference if one model has an intercept and the other does not.

Changes in Range Size Near Colonization or Extinction

rangeTimeDT <-  spp_master[!is.na(stretch_type) & propStrata!=0]
rangeTimeDT <- rangeTimeDT[,list(
    reg=reg, 
    event=as.character(event_year), 
    spp=spp, 
    type=as.character(stretch_type),
    time=ext_dist, 
    size=propStrata, 
    density=propTow_occ
)]

sizeColExt_mods <- list()
sizeColExt_mods[[1]] <- lmer(size ~ time*type + (time*type | spp/reg), data=rangeTimeDT)
sizeColExt_mods[[2]] <- lmer(size ~ time*type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
sizeColExt_mods[[3]] <- lmer(size ~ time + type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
sizeColExt_mods[[4]] <- lmer(size ~ time + (time | spp/reg), data=rangeTimeDT) # this is what I settled on previously i think
sizeColExt_mods[[5]] <- lmer(size ~ time + (1 | spp/reg), data=rangeTimeDT)
sizeColExt_mods[[6]] <- lmer(size ~ time + (time - 1 | spp/reg), data=rangeTimeDT)

# do.call(stargazer, c(sizeColExt_mods, list(type="html")))

Changes in Range Density Near Colonization or Extinction

densityColExt_mods <- list()
densityColExt_mods[[1]] <- lmer(density ~ time*type + (time*type | spp/reg), data=rangeTimeDT)
densityColExt_mods[[2]] <- lmer(density ~ time*type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
densityColExt_mods[[3]] <- lmer(density ~ time + type + (time | spp/reg) + (type|reg), data=rangeTimeDT)
densityColExt_mods[[4]] <- lmer(density ~ time + (time | spp/reg), data=rangeTimeDT) 
densityColExt_mods[[5]] <- lmer(density ~ time + (1 | spp/reg), data=rangeTimeDT)
densityColExt_mods[[6]] <- lmer(density ~ time + (time - 1 | spp/reg), data=rangeTimeDT)

do.call(stargazer, c(sizeColExt_mods, densityColExt_mods, list(type="html")))
Dependent variable:
size density
(1) (2) (3) (4) (5) (6) (7) (8) (9) (10) (11) (12)
time 0.007*** 0.007*** 0.007*** 0.007*** 0.005*** 0.007*** 0.003*** 0.003*** 0.003*** 0.003*** 0.002*** -0.001
(0.001) (0.001) (0.001) (0.001) (0.0002) (0.001) (0.0004) (0.0005) (0.0004) (0.0004) (0.0004) (0.002)
typepre_ext 0.002 0.001 0.002 0.008 0.012 0.012
(0.003) (0.005) (0.005) (0.008) (0.009) (0.008)
time:typepre_ext -0.001 0.0004 0.001 -0.00000
(0.001) (0.001) (0.001) (0.001)
Constant 0.062*** 0.071*** 0.070*** 0.062*** 0.073*** 0.055*** 0.423*** 0.425*** 0.426*** 0.426*** 0.429*** 0.444***
(0.004) (0.010) (0.010) (0.004) (0.006) (0.002) (0.010) (0.050) (0.050) (0.010) (0.010) (0.004)
Observations 5,726 5,726 5,726 5,726 5,726 5,726 5,726 5,726 5,726 5,726 5,726 5,726
Log Likelihood 7,313.790 7,305.312 7,311.481 7,291.964 6,563.036 6,809.122 2,604.312 2,802.634 2,808.608 2,589.062 2,581.763 1,586.657
Akaike Inf. Crit. -14,577.580 -14,582.620 -14,596.960 -14,565.930 -13,116.070 -13,608.240 -5,158.624 -5,577.268 -5,591.217 -5,160.123 -5,153.525 -3,163.315
Bayesian Inf. Crit. -14,411.260 -14,489.490 -14,510.480 -14,506.050 -13,082.810 -13,574.980 -4,992.304 -5,484.129 -5,504.731 -5,100.249 -5,120.261 -3,130.051
Note: p<0.1; p<0.05; p<0.01